(54) System and methods for automatic call and data transfer processing



(57) A programmable automatic call and data transfer processing
system which automatically processes incoming telephone calls,
facsimiles and e-mails based on the identity of the caller or author,
the subject matter of the message or request, and/or the time of day,
which includes: a central server for automatically answering an
incoming call and collecting voice data of a caller; a speaker
recognition module connected to the server for identifying the caller
or author; a switching module responsive to the speaker recognition
module for processing the call or message in accordance with a
preprogrammed procedure based on the identification of the caller or
author; and a programming interface for programming the server,
speaker recognizer module and the switching module. The system is
programmed by the user so as to process incoming telephone calls or
e-mail and facsimile messages based on the identity of the caller or
author, subject matter and content of the message and the time of day.
Such processing includes, but is not limited to, switching the call to
another system, forwarding the call to another telephone terminal,
placing the call on hold, or disconnecting the call. In another aspect
of the present invention, the system may be employed to process
information retrieved from other telecommunication devices such as
voice mail, facsimile/modem or e-mail. The system is capable of
tagging the identity of a caller or participants to a teleconference,
and transcribing the teleconferences, phone conversations and messages
of such callers and participants. The system can automatically index
or prioritize the received calls, messages, e-mails and facsimiles
according to the caller identification or subject matter of the
conversation or message, and allow the user to retrieve messages that
either originated from a specific source or caller or retrieve calls
which deal with similar or specific subject matter.

Description

[0001] The present invention relates to a system and methods for
providing automatic call and data transfer processing and, more
particularly, to a system and methods for providing automatic call and
data transfer processing according to a pre-programmed procedure based
on the identity of a caller or author, the subject matter and content
of a call or message and/or the time of day of such call or message.
[0002] Generally, in the past, call processing has been manually
performed either by a business owner, a secretary or a local central
phone service. There are certain conventional devices which partially
perform some call processing functions. For example, conventional
answering machines and voice-mail services record incoming telephone
messages which are then played back by the user of such devices or
services. In addition, desktop-telephone software or local PBXs
(private branch exchange) provide telephone network switching
capabilities. These conventional answering machines, voice-mail
services and switching systems, however, are not capable of
automatically performing distinct processing procedures that are
responsive to the identity of the caller or evaluating the content or
subject matter of the call or message and then handling such call or
message accordingly. Instead, the user must first answer his or her
telephone calls manually, or retrieve such calls from an answering
machine or voice-mail, and then decide how to proceed on a
call-by-call basis. The present invention eliminates or mitigates such
burdensome manual processing.
[0003] Moreover, although protected by Dual Tone Multi-Frequency
(DTMF) keying, answering machines and voice-mail services are unable
to identify or verify the caller when being remotely accessed or
re-programmed by a caller with a valid personal identification number
(PIN) which is inputted by DTMF keys. Further, conventional
teleconference centers also rely on DTMF PINs for accessibility but
are unable to verify and tag the identity of the speaker during a
teleconference. Such answering machines, voice-mail and teleconference
centers may therefore be breached by unauthorized persons with access
to an otherwise valid PIN.
[0004] It is therefore an object of the present invention to provide a
system and methods for automatic call and data transfer processing in
accordance with a pre-determined manner based on the identity of the
caller or author, the subject matter of the call or message and/or the
time of day.

[0005] It is another object of the present invention to provide a call
processing system which can first transcribe messages received by
telephone, facsimile and e-mail, as well as other data electronically
received by the system, then tag the identity of the caller (or
participants to a teleconference) or the author of such e-mail or
facsimile messages, and then index such calls, conversations and
messages according to their origin and subject matter, whereby an
authorized user can then access the system, either locally or
remotely, to playback such telephone conversations or messages or
retrieve such e-mail or facsimile messages in the form of synthesized
speech.

[0006] It is yet another object of the present invention to provide a
system that is responsive (i.e., accessible and programmable) to voice
activated commands by an authorized user, wherein the system can
identify and verify the user before allowing the user to access calls
or messages or program the system.
[0007] In one aspect of the present invention, a programmable
automatic call and message processing system comprises: server means
for receiving an incoming call; speaker recognition means, operatively
coupled to the server means, for identifying the caller; speech
recognition means, operatively coupled to the server means, for
determining subject matter and content of the call; switching means,
responsive to the speaker recognition means and speech recognition
means, for processing the call in accordance with the identity of the
caller and/or the subject matter of the call; and programming means,
operatively coupled to the server means, speaker recognition means,
speech recognition means and the switching means for programming the
system to perform the processing.
[0008] The system is preferably programmed by the user so as to
process incoming telephone calls in a pre-determined manner based on
the identity of the caller. Such processing includes, but is not
limited to, switching the call to another system, forwarding the call
to another telecommunication terminal, directing the call to an
answering machine to be recorded, placing the call on hold, or
disconnecting the call.

[0009] In another aspect of the present invention, the system may be
pre-programmed to process an incoming telephone call, facsimile or
e-mail message according to their content, subject matter, or
according to the time of the day they are received. Still further, the
system may preferably be programmed to process an incoming telephone
call, facsimile or e-mail message according to a combination of such
factors, i.e., the identity of the caller, the subject matter and
content of the call and the time of day. In addition, e-mail messages
(and other messages created by application specific software such as
LOTUS NOTES) may be processed in accordance with mood stamps, i.e.,
informational fields provided by certain mailing programs such as
LOTUS NOTES which allow the sender to indicate the nature of the
message such as the confidentiality or urgency of the message. For
future e-mail or data exchange techniques, such information can be
included in a header of the e-mail or facsimile. Further, the system
may be programmed to prompt the caller to explicitly advise the system
of the nature of the message. Still further, the system may be
configured to retrieve and process data from other telecommunication
devices such as voice mail systems or answering machines.

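The combination of factors described above (caller identity, subject
matter and time of day) can be pictured as an ordered list of
user-programmed rules. The following is only a minimal sketch under
stated assumptions: the names Rule and choose_action, the action
strings and the first-match policy are hypothetical and do not come
from the specification, and the time window does not handle ranges
that cross midnight.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass
class Rule:
    """One user-programmed rule; a field left as None matches anything."""
    action: str
    caller: Optional[str] = None
    subject_keyword: Optional[str] = None  # compared case-insensitively
    start: Optional[time] = None           # inclusive start of time window
    end: Optional[time] = None             # exclusive end of time window

    def matches(self, caller: str, subject: str, now: time) -> bool:
        if self.caller is not None and self.caller != caller:
            return False
        if (self.subject_keyword is not None
                and self.subject_keyword.lower() not in subject.lower()):
            return False
        if self.start is not None and self.end is not None:
            if not (self.start <= now < self.end):
                return False
        return True


def choose_action(rules: List[Rule], caller: str, subject: str,
                  now: time, default: str = "voice_mail") -> str:
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(caller, subject, now):
            return rule.action
    return default


# Hypothetical rule set: urgent calls from one caller are forwarded,
# everything else during office hours is placed on hold.
rules = [
    Rule(action="forward_to_mobile", caller="alice", subject_keyword="urgent"),
    Rule(action="hold", start=time(9, 0), end=time(17, 0)),
]
```

Evaluating rules in order keeps the behaviour predictable: the user
places the most specific rule first and relies on the default for
everything that no rule covers.
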
[0010] In still a further aspect of the present invention, the call
processing system of the present invention is capable of tagging the
identity of a caller or the participants to a teleconference, while
transcribing the message or conversations of such callers and
participants. Consequently, the system can automatically manage
telephone messages and conversations, as well as voice mail, e-mail
and facsimile messages, by storing such calls and messages according
to their subject matter or the identity of the caller or author, or
both. Specifically, the present invention can, in combination with
such identification and transcription, automatically index or
prioritize the received telephone calls and e-mail and facsimile
messages according to their origin and/or subject matter which allows
an authorized user to retrieve specific messages, e.g., those messages
that originated from a specific source or those which deal with
similar or specific subject matter.
[0011] In another aspect of the present invention, the system includes
text-to-speech capabilities which allow the system to prompt (i.e.,
query) the user or caller in the form of synthesized speech, to
provide answers to questions or requests by the user or caller in
synthesized speech and to playback e-mail and facsimile messages in
synthesized speech. The system also includes playback capabilities so
as to playback recorded telephone messages and other recorded audio
data.

[0012] These and other objects, features and advantages of the present
invention will become apparent from the following detailed description
of illustrative embodiments thereof, which is to be read in connection
with the accompanying drawings.

Fig. 1 is a block diagram illustrating general functions of an
automatic call and data transfer processing system in accordance with
the present invention;

Fig. 2 is a block diagram, as well as a flow diagram, illustrating the
functional interconnection between modules for a call and data
transfer processing system in accordance with an embodiment of the
present invention; and

Figs. 3a and 3b are flow diagrams illustrating a method for call or
data transfer processing in accordance with the present invention.

[0013] Referring to Fig. 1, a block diagram illustrating general
functions of an automatic call and data transfer processing system of
the present invention is shown. The present invention is an automatic
call and data transfer processing machine that can be programmed by an
authorized user (block 12) to process incoming telephone calls in a
manner pre-determined by such user. Although the present invention may
be employed to process any voice data that may be received through
digital or analog channels, as well as data received electronically
and otherwise convertible into readable text (to be further explained
below), one embodiment of the present invention involves the
processing of telephone communications. Particularly, the system 10
will automatically answer an incoming telephone call from a caller
(block 14) and, depending upon the manner in which the system 10 is
programmed by the user (block 12), the system 10 may process the
telephone call by, for example, switching the call to another
telecommunication system or to an answering machine (block 18), or by
handling the call directly, e.g., by connecting, disconnecting or
placing the caller on hold (block 16). In addition, the system 10 may
be programmed to route an incoming telephone call to various
telecommunication systems in a specific order (e.g., directing the
call to several pre-determined telephone numbers until such call is
answered) or simultaneously to all such systems. It is to be
understood that the telecommunication systems listed in block 18, as
well as the options shown in block 16 of Fig. 1, are merely
illustrative, and not exhaustive, of the processing procedures that
the system 10 may be programmed to perform.
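
The two routing modes mentioned above (trying pre-determined numbers
in a specific order until one answers, or offering the call to all
destinations at once) might be sketched as follows. This is an
illustration only: the function names and the try_connect callback,
which stands in for whatever telephony interface actually places the
call, are assumptions and not part of the specification.

```python
from typing import Callable, Iterable, List, Optional


def route_sequential(numbers: Iterable[str],
                     try_connect: Callable[[str], bool]) -> Optional[str]:
    """Try each number in order; return the first one that answers."""
    for number in numbers:
        if try_connect(number):
            return number
    return None  # nobody answered; the caller of this function decides what next


def route_simultaneous(numbers: Iterable[str],
                       try_connect: Callable[[str], bool]) -> List[str]:
    """Offer the call to every number; return all destinations that accepted."""
    return [number for number in numbers if try_connect(number)]
```

In the sequential mode the order of the list encodes the user's
preference, so a later number is never tried once an earlier one has
answered.
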
[0014] In another embodiment of the present invention, the system 10
may be programmed to process incoming facsimile and e-mail messages,
or automatically retrieve messages from e-mail or voice mail systems.
Thus, it is to be understood that the bidirectional lines of Fig. 1
connecting the system 10 to the telecommunication systems in block 18
(e.g., e-mail, voice mail, facsimile/modem and answering machine)
indicate that the system 10 is designed to send data (e.g., calls or
messages) to such systems, as well as retrieve and process data stored
or recorded in such systems. For instance, the system 10 may be
programmed to process a particular call by directing the call to an
answering machine (block 18) to be recorded. The system 10 may
subsequently retrieve the recorded message from the answering machine,
which is then decoded and processed by the system 10 in a particular
manner. Further, the system 10 can be programmed to transform an
incoming telephone call or messages into a page which can then be
transmitted to the user's pager, cellular phone or e-mail.

[0015] The functional modules of the system 10 and their specific
interaction in accordance with an embodiment of the present invention
will be explained below by reference to Fig. 2. It is to be understood
that same or similar components illustrated throughout the figures are
designated with the same reference numeral. It is to be further
understood that the functional modules described herein in accordance
with the present invention may be implemented in hardware, software,
or a combination thereof. Preferably, the main speech and speaker
recognition, language identification modules and indexing modules of
present invention, for example, are implemented in software on one or
more appropriately programmed general purpose digital computer or
computers, each having a processor, associated memory and input/output
interfaces for executing the elements of the present invention. It
should be understood that while the invention is preferably
implemented on a suitably programmed general purpose computer or
computers, the functional elements of Fig. 2 may be considered to
include a suitable and preferred processor architecture for practicing
the invention and are exemplary of functional elements which may be
implemented within such computer or computers through programming.
Further, the functional elements of Fig. 2 may be implemented by
programming one or more general purpose microprocessors. Of course,
special purpose microprocessors may be employed to implement the
invention. Given the teachings of the invention provided herein, one
of ordinary skill in the related art will be able to contemplate these
and similar implementations of the elements of the invention.
[0016] Referring now to Fig. 2, the system 10 includes a server 20
preferably connected to various telecommunication systems including,
but not limited to, one or more telephone lines (block 14) and one or
more facsimile and modem lines (Figs. 1 and 2, block 18) for receiving
and sending telephone calls and message data, respectively. The server
20 is programmed to automatically answer incoming telephone calls and
receive incoming facsimile transmissions. The system 10 may also
include a permanent internet/intranet connection for accessing a local
network mail server, whereby the server 20 can be programmed to
periodically connect to such local network mail server (via TCP/IP) to
receive and process incoming e-mails, as well as send e-mail messages.
Alternatively, if the system 10 is not permanently connected to a
local network server, the system server 20 may be programmed to
periodically dial an access number to an internet provider to retrieve
or send e-mail messages. Such procedures may also be performed at the
option of the user (as opposed to automatically monitoring such e-mail
accounts) when the user accesses the system 10.
[0017] Further, as shown in Figs. 1 and 2 (block 18), the server 20
may be directly connected to voice mail systems and answering machines
so as to allow the user to retrieve and process messages that have
been recorded on such voice-mail and answering machine systems. If the
system 10 is connected to a local network system, the server 20 may be
programmed to periodically retrieve messages from other voice mail
systems or answering machines which are not directly connected to the
server 20, but otherwise accessible through the local network, so that
the system 10 can then automatically monitor and retrieve messages
from such voice mail systems or answering machines.
[0018] The server 20 includes a recorder 40 for recording and storing
audio data (e.g., incoming telephone calls or messages retrieved from
voice mail or answering machines), preferably in digital form.
Furthermore, the server 20 preferably includes a
compression/decompression module 42 for compressing the digitized
audio data, as well as message data received via e-mail and facsimile,
so as to increase the data storage capability of a memory (not shown)
of the system 10 and for decompressing such data before reconstruction
when such data is retrieved from memory.
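
The compress-before-store and decompress-on-retrieve behaviour of the
compression/decompression module 42 can be illustrated with a short
sketch. The specification does not name a codec, so the use of
Python's general-purpose zlib module here is purely an assumption, as
are the function names and the dictionary standing in for the memory;
a real system would more likely use an audio-specific codec.

```python
import zlib
from typing import Dict


def store_message(memory: Dict[str, bytes], key: str, data: bytes) -> None:
    """Compress digitized message data before placing it in memory."""
    memory[key] = zlib.compress(data)


def retrieve_message(memory: Dict[str, bytes], key: str) -> bytes:
    """Decompress stored data so that it can be reconstructed or played back."""
    return zlib.decompress(memory[key])
```

Because zlib is lossless, the retrieved bytes are identical to what
was stored; a lossy speech codec would trade that guarantee for a
higher compression ratio.
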

[0019] A speaker recognizer module 22 and an automatic speech
recognizer/natural language understanding (ASR/NLU) module 24 are
operatively coupled to the server 20. The speaker recognizer module 22
determines the identity of the caller 14 and participants to a
conference call from the voice data received by the server 20, as well
as the author of a received facsimile or e-mail message. The ASR/NLU
module 24 converts voice data and other message data received from the
server 20 into readable text to determine the content and subject
matter of such calls, conversations or messages. In addition, as
further demonstrated below, the ASR/NLU module 24 processes verbal
commands from an authorized user to remotely program the system 10, as
well as to generate or retrieve messages. The ASR/NLU module 24 also
processes voice data from callers and authorized users to perform
interactive voice response (IVR) functions. A language
identifier/translator module 26, operatively connected to the ASR/NLU
module 24, is provided so that the system 10 can understand and
properly respond to messages in foreign language when the system is
used, for example, in a multi-language country such as Canada.
[0020] A switching module 28, operatively coupled to the speaker
recognizer module 22 and the ASR/NLU module 24, processes data
received by the speaker recognizer module 22 and/or the ASR/NLU module
24. The switching module performs a processing procedure with respect
to incoming telephone calls or facsimile or e-mail messages (e.g.,
directing a call to voice-mail or answering machine) in accordance
with a pre-programmed procedure.

[0021] An identification (ID) tagger module 30, operatively connected
to the speaker recognizer module 22, is provided for electronically
tagging the identity of the caller to the caller's message or
conversation or tagging the identity of the author of an e-mail or
facsimile message. Further, when operating in the background of a
teleconference, the ID tagger 30 will tag the identity of the person
currently speaking. A transcriber module 32, operatively connected to
the ASR/NLU module 24, is provided for transcribing the telephone
message or conversation, teleconference and/or facsimile message. In
addition, the transcriber module 32 can transcribe a verbal message
dictated by the user, which can subsequently be sent by the system 10
to another person via telephone, facsimile or e-mail.
[0022] An audio indexer/prioritizer module 34 is operatively connected
to the ID tagger module 30 and the transcriber module 32. The audio
indexer/prioritizer module 34 stores the transcription data and caller
identification data which is processed by the transcriber module 32
and the ID tagger module 30, respectively, as well as the time of the
call, the originating phone number (via automatic number
identification (ANI) if available) and e-mail address, in a
pre-programmed manner, so as to allow the user to retrieve specific
calls or messages from a particular party or those calls or messages
which pertain to specific subject matter. Further, the audio
indexer/prioritizer can be programmed to prioritize certain calls or
messages and inform the user of such calls or messages.
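
Retrieval by originating party or by subject matter, as described for
the audio indexer/prioritizer module 34, amounts to keeping two
lookup tables over the same stored records. The sketch below is an
illustration under assumptions: the Record fields, the class name
MessageIndex and the use of an exact subject label as the key are
hypothetical choices, not details from the specification.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    caller: str       # identity tag supplied by the ID tagger
    subject: str      # subject label derived from the transcription
    received_at: str  # time of the call or message, e.g. an ISO-8601 string
    transcript: str


class MessageIndex:
    """Stores each record once per key so both lookups are direct."""

    def __init__(self) -> None:
        self._by_caller: Dict[str, List[Record]] = defaultdict(list)
        self._by_subject: Dict[str, List[Record]] = defaultdict(list)

    def add(self, record: Record) -> None:
        self._by_caller[record.caller].append(record)
        self._by_subject[record.subject].append(record)

    def from_caller(self, caller: str) -> List[Record]:
        return list(self._by_caller.get(caller, []))

    def about(self, subject: str) -> List[Record]:
        return list(self._by_subject.get(subject, []))
```
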

[0023] A speech synthesizer module 36, operatively connected to the
audio indexer/prioritizer module 34, allows the user to retrieve
messages (e-mails or facsimiles) in audio form (i.e., synthesized
speech). The speech synthesizer is also operatively coupled to the
ASR/NLU module for providing system prompts (i.e., queries) in the
form of synthesized speech (as opposed to being displayed, for
example, on a computer monitor).

[0024] A programming interface 38, operatively coupled to the server
20, speaker recognizer module 22, language identifier/translator
module 26, ASR/NLU module 24, audio indexer/prioritizer module 34 and
the switching module 28, is provided for programming the system 10 to
process calls and messages in accordance with a pre-determined
procedure. As explained in detail below, a user may program the system
10 using the programming interface 38 through either voice commands or
a GUI (graphical user interface), or both. In a preferred embodiment,
the system 10 is programmed by verbal commands from the user (i.e.,
voice command mode). Specifically, the user may program the system 10
with verbal commands either remotely, by calling into the system 10,
or locally with a microphone. The programming interface 38 is
connected to the server 20 which, in conjunction with the speaker
recognizer module 22 and the ASR/NLU module 24, verifies the identity
of the user before processing the verbal programming commands of the
user. The system 10 may either display (via the GUI) or play back (via
the speech synthesizer 36) information relating to the verbal
programming commands (i.e., whether the system 10 recognizes such
command), as well as the current programming structure of the system
10.

[0025] In another embodiment, the system 10 may be programmed locally,
through a PC and GUI screen, or programmed remotely, by accessing the
system 10 through a computer network from a remote location. Similar
to conventional windows interface, the user may program the system 10
by selecting certain fields which may be displayed on the GUI. It is
to be appreciated that the system 10 may be programmed through a
combination of voice commands and a GUI. In such a situation, the GUI
may, for example, provide assistance to the user in giving the
requisite voice commands to program the system 10. Still further, the
system 10 may be programmed by editing a corresponding programming
configuration file which controls the functional modules of Fig. 2.
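
The specification does not describe the layout of the programming
configuration file. Purely as an illustration of how an editable file
could drive per-caller processing, the sketch below assumes a JSON
document with a hypothetical "callers" table and "default_action"
entry; none of these keys or function names come from the source.

```python
import json
from typing import Dict

# Hypothetical configuration text; the real file format is not specified.
EXAMPLE_CONFIG = """
{
  "default_action": "voice_mail",
  "callers": {
    "alice": "forward_to_mobile",
    "bob": "hold"
  }
}
"""


def load_actions(text: str) -> Dict[str, str]:
    """Parse the configuration into a caller-to-action table."""
    config = json.loads(text)
    actions = dict(config.get("callers", {}))
    # The "*" key holds the fallback used for callers without an entry.
    actions["*"] = config.get("default_action", "voice_mail")
    return actions


def action_for(actions: Dict[str, str], caller: str) -> str:
    """Look up the action for a caller, falling back to the default."""
    return actions.get(caller, actions["*"])
```
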



[0026] The operation of the present invention will now be described
with reference to Fig. 2 and Figs. 3a and 3b. It is to be understood
that the depiction of the present invention in Fig. 2 could be
considered a flow chart for illustrating operations of the present
invention, as well as a block diagram showing an embodiment of the
present invention. The server 20 is programmed to automatically answer
an incoming telephone call, e-mail, facsimile/modem, or other
electronic voice or message data (step 100). The server 20
distinguishes between incoming telephone calls, e-mail messages,
facsimile messages, etc., by special codes, i.e., protocols, at the
beginning of each message which indicate the source. Particularly, the
server 20 initially assumes that the incoming call is a telephone
communication and will proceed accordingly (step 110) unless the
server 20 receives, for example, a modem handshake signal, whereby the
system 10 will handle the call as a computer connection protocol. It
is to be understood that the system 10 may be programmed to monitor
other voice mail or e-mail accounts by periodically calling and
retrieving voice mail and e-mail messages from such accounts.
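
The default-to-telephone behaviour described above can be shown in a
few lines. This is a sketch only: the signal names are hypothetical
labels for whatever the line hardware reports, and the facsimile tone
entry is an added assumption, since the text gives the modem handshake
as its only example.

```python
from typing import Optional

# Hypothetical signal labels mapped to the handling they trigger.
SIGNAL_TO_SOURCE = {
    "modem_handshake": "computer_connection",
    "fax_tone": "facsimile",  # assumption, not stated in the text
}


def classify_incoming(signal: Optional[str]) -> str:
    """Treat the call as a telephone call unless a known signal is seen."""
    return SIGNAL_TO_SOURCE.get(signal, "telephone")
```
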

[0027] If it is determined that the incoming call received by the
server 20 is a telephone call, the audio data (e.g., incoming calls as
well as calls retrieved from voice mail or answering machines) is
recorded by the recorder 40 (step 112). The recorder 40 may be any
conventional device such as an analog recorder or digital audio tape
(DAT). Preferably, the recorder 40 is a digital recorder, i.e., an
analog-to-digital converter for converting the audio data into digital
data. The digitized audio data may then be compressed by the
compression/decompression module 42 (step 114) before being stored
(step 116) in memory (not shown in Fig. 2). It is to be appreciated
that any conventional algorithm, such as those disclosed in "Digital
Signal Processing, Synthesis and Recognition" by S. Furui, Dekker,
1989, may be employed by the compression/decompression module 42 to
process the message data.
[0028] Next, simultaneously with the recording and storing of the
audio data, the identity of the caller is determined by processing the
caller's audio communications and/or audio response to queries by the
system 10. Specifically, the caller's verbal statements and responses
are received by the server 20 and sent to the speaker recognizer
module 22, wherein such verbal statements and responses are processed
and compared with previously stored speaker models (step 120). If the
speaker is identified by matching the received voice data with a
previously stored voice model of such speaker (step 130), and if the
system 10 is pre-programmed to process calls based on the identity of
a caller, the system 10 will then process the telephone call in
accordance with such pre-programmed procedure (step 152).

EP 0 935 378 A2

[0029] If, on the other hand, the speaker (e.g., a first time caller)
cannot be identified via the previously stored voice models, speaker
identification may be performed by both the speaker recognizer module
22 and the ASR/NLU module 24, whereby the content of the telephone
message may be processed by the ASR/NLU module 24 to extract the
caller's name which is then compared with previously stored names to
determine the identity of such caller (step 140). If the identity of
the caller is then determined, the system 10 will process the
telephone call in accordance with the identity of the caller (step
152).
[0030] In the event that the system 10 is unable to identify the
caller from either the stored voice models or the content of the
telephone message, the speaker recognizer module 22 sends a signal to
the server 20 which, in turn, prompts the caller to identify him or
herself with a query, e.g., "Who are you," (step 150) and the above
identification process is repeated (step 120). The server 20 obtains
the query in synthesized speech from speech synthesizer module 36. It
is to be understood that, as stated above, the system 10 may be
programmed to initially prompt the caller to identify him or herself
or ask details regarding the reason for the call.
[0031] Once the caller or author has been identified by the speaker
recognizer module 22, a signal is sent by the speaker recognizer
module 22 to the switching module 28, whereby the switching module 28
processes the call or message based on the identity of the caller or
author in accordance with a pre-programmed procedure (step 152). If,
on the other hand, the identity of the caller ultimately cannot be
identified, the system 10 may be programmed to process the call based
on an unknown caller (step 154) by, e.g., forwarding the call to a
voice mail. Such programming, to be further explained, is performed by
the user 12 through the programming interface module 38. As stated
above, the processing options which the system 10 may be programmed to
perform include, but are not limited to, switching the call to another
system, directing the call to another telecommunication terminal
(Figs. 1 and 2, block 18) or directly handling the call by either
connecting the call to a particular party, disconnecting the call, or
placing the call on hold (Figs. 1 and 2, block 16).
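
The identification sequence of steps 120 to 154 (voice-model match,
then name extraction from the message content, then an explicit
prompt, and finally treatment as an unknown caller) can be summarised
as a fallback loop. The sketch below is a simplified illustration:
all callbacks are hypothetical stand-ins for the speaker recognizer,
the ASR/NLU module and the prompting server, and the limit on the
number of prompts is an assumption, since the text does not say how
often the query is repeated.

```python
from typing import Callable, Optional, Set, Tuple


def identify_caller(
    voice_sample: bytes,
    transcript: str,
    match_voice: Callable[[bytes], Optional[str]],
    extract_name: Callable[[str], Optional[str]],
    known_names: Set[str],
    prompt_caller: Callable[[str], Tuple[bytes, str]],
    max_prompts: int = 1,
) -> Optional[str]:
    """Return the caller's identity, or None for an unknown caller."""
    prompts_used = 0
    while True:
        # Steps 120/130: compare the voice data with stored speaker models.
        speaker = match_voice(voice_sample)
        if speaker is not None:
            return speaker
        # Step 140: look for a known name in the content of the message.
        name = extract_name(transcript)
        if name is not None and name in known_names:
            return name
        # Step 154: give up and let the unknown-caller procedure run.
        if prompts_used >= max_prompts:
            return None
        # Step 150: ask the caller directly, then repeat the process.
        voice_sample, transcript = prompt_caller("Who are you?")
        prompts_used += 1
```

A return value of None corresponds to the unknown-caller branch, for
which the text gives forwarding to voice mail as one example.
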
[0032] It is to be appreciated that whenever a new caller interacts
with the system 10 for the first time, speaker models are built and
stored in the speaker recognizer module 22, unless erased at the
option of the user. Such models are then utilized by the speaker
recognizer module 22 for identification and verification purposes when
that caller interacts with the system 10 at a subsequent time.
[0033] It is to be appreciated that the system 10 may perform speaker
identification by utilizing methods other than acoustic features when
the requisite voice models do not exist. For example, with regard to
telephone calls, the system 10 may utilize additional information
(e.g., caller ID) to enhance the accuracy of the system 10 and/or to
identify first time callers.
[0034] As further explained below, the system 10 may be programmed to
store the name and originating telephone number of every caller (or
specific callers). Such capability allows the user to automatically
send reply messages to callers, as well as dynamically create an
address book (which is stored in the system 10) which can be
subsequently accessed by the user to send a message to a particular
person.
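
A dynamically built address book of this kind, which also supports
identifying a caller from the originating number when no voice model
exists, could be kept as a pair of dictionaries. The class and method
names below are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Optional


class AddressBook:
    """Maps caller names to originating numbers and back again."""

    def __init__(self) -> None:
        self._number_by_name: Dict[str, str] = {}
        self._name_by_number: Dict[str, str] = {}

    def record_call(self, name: str, number: str) -> None:
        """Remember the caller's name and originating telephone number."""
        self._number_by_name[name] = number
        self._name_by_number[number] = name

    def number_for(self, name: str) -> Optional[str]:
        """Number to use when sending a reply to a particular person."""
        return self._number_by_name.get(name)

    def name_for(self, number: str) -> Optional[str]:
        """Caller-ID lookup for callers without a stored voice model."""
        return self._name_by_number.get(number)
```
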
[0035] It is to be understood that depending upon the application, it
is not necessary that the system 10 perform speaker recognition and
natural language understanding in real time (i.e., simultaneously with
the recording and during the time period of the actual telephone call)
in every instance. For example, the system 10 can be programmed to
query the caller (via IVR programming) to obtain relevant information
(i.e., name and reason for call) at the inception of the call and
store such information. The identification process may then be
performed by the speaker recognizer module 22 or the ASR/NLU module 24
subsequent to the call by retrieving the stored audio data from memory
(step 118) (as indicated by the dotted line in Fig. 3a).
[0036] It is to be understood that any type of speaker recognition
system may be utilized by the speaker recognizer module 22 for
identifying the caller. Preferably, the speaker recognition system
employed in accordance with the present invention is the system which
performs text-independent speaker verification and asks random
questions, i.e., a combination of speech recognition, text independent
speaker recognition and natural language understanding as disclosed in
U.S. Serial No. 08/871,784, filed on June 11, 1997, and entitled:
"Apparatus And Methods For Speaker Verification / Identification /
Classification Employing Non-Acoustic And/Or Acoustic Models and
Databases," the disclosure of which is incorporated herein by
reference. More particularly, the text-independent speaker
verification system is preferably based on a frame-by-frame feature
classification as disclosed in detail in U.S. Serial No. 08/788,471
filed on January 28, 1997 and entitled: "Text Independent Speaker
Recognition for Transparent Command Ambiguity Resolution And
Continuous Access Control," the disclosure of which is also
incorporated herein by reference.

[0037] As explained in the above-incorporated reference
U.S. Serial No. 08/871,784, text-independent
speaker recognition is preferred over text-dependent or
text-prompted speaker recognition because text independence
allows the speaker recognition function to be
carried out in parallel with other speech recognition-based
functions in a manner transparent to the caller
without requiring interruption for new commands or
identification of a new caller whenever a new caller is
encountered.

[0038] Next, referring to Fig. 3b (and assuming the
system 10 is programmed to process calls based on the
identity of a caller or author), if it is determined that the
incoming call is a facsimile or e-mail message, the message
data (e.g., incoming e-mails or messages
retrieved from e-mail accounts) are processed by the
ASR/NLU module 24 (step 190), compressed (step
192), and stored (step 194) in memory (not shown).
With regard to e-mail messages, the data is directly
processed (since such data is already in text format).
With regard to facsimile messages, the ASR/NLU module
24 employs optical character recognition (OCR)
using known techniques to convert the received facsimile
message into readable text (i.e., transcribe the facsimile
message into an ASCII file).
[0039] Next, simultaneously with the transcribing and
storing of the incoming message data, the identity of the
author of such message may be determined via the
ASR/NLU module 24 whereby the content of the incoming
message is analyzed (step 200) to extract the
author's name or the source of the message, which is
then compared with previously stored names to determine
the identity of such author (step 210). If the author
is identified (step 210), the message can be processed
in accordance with a pre-programmed procedure based
on the identity of the author (step 222). If, on the other
hand, the identity of the author cannot be identified, the
message may be processed in accordance with the
pre-programmed procedure for an unidentified author (step
224).
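
The author-identification branch of paragraph [0039] can be sketched as follows. This is a minimal illustration only: the "From:" header convention, the table of stored names and the procedure labels are assumptions made for the example and do not appear in the specification.

```python
import re

# Hypothetical stored names, each mapped to the procedure programmed for that author.
KNOWN_AUTHORS = {"jane doe": "forward_to_user", "bob smith": "file_in_archive"}
# Hypothetical procedure for an unidentified author (compare step 224).
UNIDENTIFIED_PROCEDURE = "hold_for_review"


def extract_author(message_text):
    """Return the lower-cased name found on a 'From:' line, or None."""
    match = re.search(r"^From:\s*(.+)$", message_text,
                      flags=re.MULTILINE | re.IGNORECASE)
    return match.group(1).strip().lower() if match else None


def select_procedure(message_text):
    """Compare the extracted name with stored names and pick a procedure."""
    author = extract_author(message_text)
    if author in KNOWN_AUTHORS:           # author identified (compare step 222)
        return KNOWN_AUTHORS[author]
    return UNIDENTIFIED_PROCEDURE         # author not identified
```

A real system would extract the name with natural language understanding rather than a fixed header pattern; the lookup-then-dispatch structure is the point of the sketch.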
[0040] As stated above, it is to be understood that it is
not necessary that the system 10 process the incoming
or retrieved message in real time (i.e., simultaneously
with the transcribing of the actual message) in every
instance. Processing may be performed by the
ASR/NLU module 24 subsequent to receiving the e-mail
or facsimile message data by retrieving the transcribed
message data from memory (step 196) (as indicated by
the dotted line in Fig. 3b).

[0041] In addition to the identity of the caller or author,
the system 10 may be further programmed by the user
12 to process an incoming telephone call or facsimile or
e-mail message based on the content and subject matter
of the call or message and/or the time of day in which
such call or message is received. Referring again to
Figs. 2, 3a and 3b, after receiving an incoming telephone
call or e-mail or facsimile message, or after
retrieving a recorded message from an answering
machine or voice mail, the server 20 sends the call or
message data to the ASR/NLU module 24. In the case
of voice data (e.g., telephone calls or messages
retrieved from voice mail or answering machine), the
ASR/NLU module 24 converts such data into symbolic
language or readable text. As stated above, e-mail messages
are directly processed (since they are in readable
text format) and facsimile messages are converted into
readable text (i.e., ASCII files) via the ASR/NLU module
26 using known optical character recognition (OCR)
methods. The ASR/NLU module 26 then analyzes the
call or message data by utilizing a combination of
speech recognition to extract certain keywords or topics
and natural language understanding to determine the
subject matter and content of the call (step 160 in Fig.
3a for telephone calls) or message (step 200 in Fig. 3b
for e-mails and facsimiles).

[0042] Once the ASR/NLU module determines the
subject matter of the call (step 170 in Fig. 3a) or the
message (step 220 in Fig. 3b), a signal is then sent to
the switching module 28 from the ASR/NLU module 24,
wherein the call or message is processed in accordance
with a predetermined manner based on the subject
matter and content of the call (step 158 in Fig. 3a) or the
content of the message (step 228 in Fig. 3b). For
instance, if a message or call relates to an emergency
or accident, the switching module 28 may be programmed
to transfer the call immediately to a certain
individual.

[0043] In the event that the ASR/NLU module 24 is
unable to determine the subject matter or content of a
telephone call, the ASR/NLU module 24 sends a signal
to the speech synthesizer 36 which, in turn, sends a
message to the server 20, to prompt the caller to articulate
in a few words the reason for the call (step 180),
e.g., "What is the reason for your call?" Again, it is to be
understood that the system 10 may be programmed to
initially prompt the caller to state the reason for the call.
If the system 10 is still unable to determine the subject
matter of such call, the call may be processed in accordance
with a pre-programmed procedure based on
unknown matter (step 156). Likewise, if the subject matter
of an e-mail or facsimile message cannot be determined
(step 220), the message may be processed in
accordance with a pre-programmed procedure based
on unknown matter (step 226).
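
The subject-matter routing of paragraphs [0042] and [0043], including the single re-prompt and the fallback for unknown matter, might be sketched as below. The keyword sets, topic names and procedure labels are hypothetical, and simple keyword matching stands in for the speech recognition and natural language understanding the specification describes.

```python
import re

# Hypothetical keyword sets per topic and the procedure programmed for each topic.
TOPIC_KEYWORDS = {
    "emergency": {"emergency", "accident", "urgent"},
    "billing": {"invoice", "payment"},
}
PROCEDURES = {"emergency": "transfer_immediately", "billing": "send_to_voice_mail"}
UNKNOWN_MATTER = "unknown_matter_procedure"


def detect_topic(text):
    """Return the first topic whose keywords occur in the text, else None."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return topic
    return None


def route_call(transcript, prompt_caller):
    """Pick a procedure; prompt the caller once if the topic is unclear."""
    topic = detect_topic(transcript)
    if topic is None:
        # compare step 180: ask the caller to state the reason for the call
        reply = prompt_caller("What is the reason for your call?")
        topic = detect_topic(reply)
    return PROCEDURES.get(topic, UNKNOWN_MATTER)
```

Here `prompt_caller` is any callable that plays the question and returns the caller's transcribed answer, for example `route_call("hello", lambda q: "about my invoice")`.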
[0044] Further, in the event that an incoming call or
e-mail message is in a language foreign to the system 10
(i.e., foreign to the user), the ASR/NLU module 26 will
signal the language identifier/translator module 26 to
identify the particular language of the call or message,
and then provide the required translation to the
ASR/NLU module 26 so as to allow the system 10 to
understand the call and answer the caller in the proper
language. It is to be understood that the system 10 may
also be pre-programmed to process calls or messages
with an unknown language in a particular manner.
[0045] It is to be appreciated that any conventional
technique for language identification and translation
may be employed in the present invention, such as the
well-known machine language identification technique
disclosed in the article by Hieronymus J. and Kadambe
S., "Robust Spoken Language Identification using
Large Vocabulary Speech Recognition," Proceedings of
ICASSP 97, Vol. 2 pp. 1111, as well as the language
translation technique disclosed in Hutchins and Somers
(1992): "An Introduction to Machine Translation," Academic
Press, London; (encyclopedic overview).
[0046] In addition to the above references, language
identification can be performed using several statistical
methods. First, if the system 10 is configured to process
a small number of different languages (e.g., in Canada
where essentially only English or French are spoken),
the system 10 may decode the input text in each of the
different languages (using different ASR systems). The
several decoded scripts are then analyzed to find statistical
patterns (i.e., the statistical distribution of decoded
words in each script is analyzed). If the decoding was
performed in the wrong language, the perplexity of the
decoded script would be very high, and that particular
language would be excluded from consideration.
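
The perplexity test of paragraph [0046] can be illustrated with toy unigram models. The vocabularies, probabilities and the floor value for unseen words are invented for this sketch; a practical system would use full language models trained for each ASR decoder.

```python
import math

# Toy unigram language models (hypothetical probabilities).
UNIGRAMS = {
    "english": {"the": 0.4, "call": 0.3, "is": 0.3},
    "french": {"le": 0.4, "appel": 0.3, "est": 0.3},
}
FLOOR = 1e-6  # assumed probability assigned to words the model has never seen


def perplexity(words, model):
    """Unigram perplexity of a non-empty word list under the given model."""
    log_sum = sum(math.log(model.get(word, FLOOR)) for word in words)
    return math.exp(-log_sum / len(words))


def pick_language(decoded_scripts):
    """decoded_scripts maps language -> words output by that language's decoder.

    Returns the language with the lowest perplexity and all the scores.
    """
    scores = {lang: perplexity(words, UNIGRAMS[lang])
              for lang, words in decoded_scripts.items()}
    return min(scores, key=scores.get), scores
```

Decoding English speech with the French decoder tends to yield words that the French model finds improbable, so the French script scores a very high perplexity and that language drops out of consideration.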
[0047] Next, language identification may be performed
on a phonetic level where the system recognizes a set
of phonemes (either using a universal phonetic system
or several systems for different languages). The system
then estimates the frequencies of the decoded phoneme
sequences for each language. If a particular
decoded sequence is unusual, the system would
exclude such language from consideration. There may
also be some sequences which are typical for a certain
language. Using such factors, the system will identify
the most probable language.
[0048] It is to be appreciated that the present invention
may utilize the identity of the caller to perform language
identification. Specifically, if the speaker profile of a certain
caller (which is stored in the system 10) indicates
that the caller speaks in a certain language, this information
may be a factor in identifying the language. Conversely,
if the system 10 identifies a particular language
using any of the above methods, the system 10 may
then determine the identity of a caller by searching the
speaker profiles to determine which speakers use such
identified language.
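
Both directions described in paragraph [0048] can be sketched with a small table of speaker profiles. The profile contents, the boost factor and the function names are assumptions for illustration; the specification does not say how the caller's profile is weighted against other evidence.

```python
# Hypothetical speaker profiles: caller name -> languages the caller is known to use.
SPEAKER_PROFILES = {
    "alice": {"english"},
    "benoit": {"french", "english"},
    "chloe": {"french"},
}
BOOST = 2.0  # assumed weight given to languages listed in the caller's profile


def likely_language(scores, caller):
    """scores maps language -> score (higher is better); use the profile as a factor."""
    known = SPEAKER_PROFILES.get(caller, set())
    adjusted = {lang: score * (BOOST if lang in known else 1.0)
                for lang, score in scores.items()}
    return max(adjusted, key=adjusted.get)


def speakers_using(language):
    """Conversely, narrow the caller's identity to speakers who use the language."""
    return sorted(name for name, langs in SPEAKER_PROFILES.items()
                  if language in langs)
```

An unknown caller leaves the scores unchanged, so the acoustic or statistical evidence alone decides.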

[0049] It is to be understood that both speech recognition
and natural language understanding may be utilized
by the ASR/NLU module 24 to process data
received from the server 20. The present invention preferably
employs the natural language understanding
techniques disclosed in U.S. Serial No. 08/859,586,
filed on May 20, 1997, and entitled: "A Statistical Translation
System with Features Based on Phrases or
Groups of Words," and U.S. Serial No. 08/593,032, filed
on January 29, 1988 and entitled "Statistical Natural
Language Understanding Using Hidden Clumpings,"
the disclosures of which are incorporated herein by reference.
The above-incorporated inventions concern
natural language understanding techniques for parameterizing
(i.e., converting) text input (using certain algorithms)
into language which can be understood and
processed by the system 10. For example, in the context
of the present invention, the ASR component of the
ASR/NLU module 24 supplies the NLU component of
such module with unrestricted text input such as "Play
the first message from Bob." Such text may be converted
by the NLU component of the ASR/NLU module
24 into "retrieve-message(sender=Bob, message-number=1)."
Such parameterized action can then be
understood and acted upon by the system 10.
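
The conversion of free text into a parameterized action can be illustrated with a simple pattern matcher. This is only a stand-in, assuming a single fixed sentence pattern and a small ordinal table; the incorporated applications describe statistical techniques, which this sketch does not implement.

```python
import re

# Hypothetical ordinal vocabulary and a single fixed sentence pattern.
ORDINALS = {"first": 1, "second": 2, "third": 3}
PATTERN = re.compile(r"^play the (\w+) message from (\w+)\.?$", re.IGNORECASE)


def parameterize(utterance):
    """Map recognized text to a parameterized action string, or None."""
    match = PATTERN.match(utterance.strip())
    if match is None or match.group(1).lower() not in ORDINALS:
        return None
    number = ORDINALS[match.group(1).lower()]
    sender = match.group(2)
    return f"retrieve-message(sender={sender}, message-number={number})"
```

Returning `None` for anything outside the pattern gives the caller of this function a clear signal that the request was not understood and a re-prompt is needed.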
[0050] The known automatic speech recognition functions
disclosed in the article by Zeppenfeld, et al., entitled
"Recognition of Conversational Telephone Speech
Using The Janus Speech Engine," Proceedings of
ICASSP 97, Vol. 3, pp. 1815 1997; and the known natural
language understanding functions disclosed in the
article by K. Shirai and S. Furui, entitled "Special Issue
on Spoken Dialog," 15, (3-4) Speech Communication,
1994 may also be employed in the present invention.
Further, to simplify the programming of the ASR/NLU
module 24, the keyword spotting based recognition
methods as disclosed in "Word Spotting from Continuous
Speech Utterances," Richard C. Rose, Automatic
Speech and Speaker Recognition, Advanced Topics,
pp. 303-327, edited by Chin-Hui Lee, Frank K. Soong,
Kuldip K. Paliwal (Kluwer Academic Publishers), 1996
may preferably be used to guarantee that certain critical
messages are sufficiently handled.

[0051] It is to be appreciated that by utilizing natural
language understanding, as demonstrated above, the
system 10 is capable of performing interactive voice
response (IVR) functions so as to establish a dialog with
the user or caller to provide dialog management and
request understanding. This enables the system 10 to
be utilized for order taking and dialog-based form filling.
Further, such functions allow the caller to decide how to
process the call (assuming the system 10 is programmed
accordingly), i.e., by leaving an e-mail or
voice mail message, sending a page or transferring the
call to another telephone number. In addition, as will be
explained below, this allows the system 10 to be
remotely programmed by the user through voice commands.

[0052] It is to be further appreciated that the system
10 provides security against unauthorized access to the
system 10. Particularly, in order for a user to have
access to and participate in the system 10, the user
must go through the system's enrollment process. This
process may be effected in various ways. For instance,
enrollment may be performed remotely by having a new
user call and enter a previously issued personal identification
number (PIN), whereby the server 20 can be programmed
to respond to the PIN which is input into the
system 10 via DTMF keys on the new user's telephone.
The system 10 can then build voice models of the new
user to verify and identify the new user when he or she
attempts to access or program the system 10 at a subsequent
time. Alternatively, either a recorded or live telephone
conversation of the new user may be utilized to
build the requisite speaker models for future identification
and verification.

[0053] It is to be appreciated that the server 20 of the
present invention may be structured in accordance with
the teachings of patent application (IBM Docket Number
Y0997-313) entitled "Apparatus and Methods For Providing
Repetitive Enrollment in a Plurality of Biometric
Recognition Systems Based on an Initial Enrollment,"
the disclosure of which is incorporated by reference
herein, so as to make the speaker models (i.e., biometric
data) of authorized users (which are stored in the
server 20) available to other biometric recognition
based systems to automatically enroll the user without
the user having to systematically provide new biometric
models to enroll in such systems.
[0054] The process of programming the system 10
can be performed by a user either locally, via a GUI
interface or voice commands, or remotely, over a telephone
line (voice commands) or through a network system
connected to the system. In either event, this is
accomplished through the programming interface 38.
As demonstrated above, programming the system 10 is
achieved by, e.g., selecting the names of persons who
should be transferred to a certain number, voice mail or
answering machine, by inputting certain keywords or
topics to be recognized by the system 10 as requiring
certain processing procedures and/or by programming
the system 10 to immediately connect emergency calls
or business calls between the hours of 8:00 a.m. and
12:00 p.m. As shown in Fig. 2, the programming interface
38 sends such information to the server 20,
speaker recognizer module 22, ASR/NLU module 26,
language identifier/translator module 24, audio
indexer/prioritizer module 34 and the switching module
28, which directs the system 10 to process calls in
accordance with the user's programmed instructions.
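
The kinds of user-programmed instructions listed in paragraph [0054] (by caller name, by topic, by time of day) can be represented as an ordered rule table. The rule fields, action labels and default action are hypothetical, and the sketch assumes that the time window applies to business calls only while emergency calls are always connected, which is one possible reading of the example.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class Rule:
    action: str
    caller: Optional[str] = None   # None means "any caller"
    topic: Optional[str] = None    # None means "any topic"
    start: Optional[time] = None   # window applies only if both bounds are set
    end: Optional[time] = None

    def matches(self, caller, topic, now):
        if self.caller is not None and self.caller != caller:
            return False
        if self.topic is not None and self.topic != topic:
            return False
        if self.start is not None and self.end is not None:
            return self.start <= now < self.end
        return True


# Hypothetical program entered through the programming interface.
RULES = [
    Rule("connect_immediately", topic="emergency"),
    Rule("connect_immediately", topic="business", start=time(8, 0), end=time(12, 0)),
    Rule("answering_machine", caller="Mr. Smith"),
]
DEFAULT_ACTION = "voice_mail"


def decide(caller, topic, now):
    """Return the action of the first matching rule, else the default."""
    for rule in RULES:
        if rule.matches(caller, topic, now):
            return rule.action
    return DEFAULT_ACTION
```

Because the first matching rule wins, the order of the table expresses priority: an emergency call from Mr. Smith is connected rather than sent to the answering machine.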
[0055] The programming interface is responsive to
either DTMF key signals or voice commands by an
authorized user. The preferred method of programming
the system 10 is through voice activated commands via
a process of speech recognition and natural language
understanding, as opposed to DTMF keying or via GUI
interface. This process allows the system 10 to verify
and identify the user before the user is provided access
to the system 10. This provides security against unauthorized
users who may have knowledge of an otherwise
valid PIN. Specifically, before the user can program
the system 10 through voice commands, the user's
voice is first received by server 20, and then identified
and verified by the speaker recognizer module 22. Once
the user's identification is verified, the server 20 will signal
the programming interface 38 to allow the user to
proceed with programming the system 10.
[0056] The voice commands for programming the system
10 are processed in the ASR/NLU module 24. Particularly,
during such programming, the ASR/NLU
module 24 is in a command and control mode, whereby
every voice instruction or command received by the programming
interface 38 is sent to the ASR/NLU module
24, converted into symbolic language and interpreted
as a command. For instance, if the user wants the system
10 to direct all calls from his wife to his telephone
line, the user may state, e.g., "Immediately connect all
calls from my wife Jane," and the system 10 will recognize
and process such programming command accordingly.

[0057] Moreover, the user can establish a dialog with
the system 10 through the ASR/NLU module 24 and the
speech synthesizer module 35. The user can check the
current program by asking the programming interface
38, e.g., "What calls are transferred to my answering
machine." This query is then sent from the server 20 (if
the user is calling into the system 10 from an outside
line), or from the programming interface 28 via the
server 20 (if the user is in the office), to the ASR/NLU
module 24, wherein the query is processed. The
ASR/NLU module 24 will then generate a reply to the
query, which is sent to the speech synthesizer 36 to
generate a synthesized message, e.g., "All personal
calls are directed to your answering machine," which is
then played to the user.

[0058] Similarly, if the system 10 is unable to understand
a verbal programming request from an authorized
user, the ASR/NLU module 24 can generate a prompt
for the user, e.g., "Please rephrase your request," which
is processed by the speech synthesizer 36. Specifically,
during such programming, the server 20 sends a programming
request to the programming interface 38. If
the system 10 is unable to decipher the request, the
programming interface 38 sends a failure message
back to the server 20, which relays this message to the
ASR/NLU module 24. The ASR/NLU module 24 may
then either reprocess the query for a potential different
meaning, or it can prompt the user (via the speech synthesizer
36) to issue a new programming request.
[0059] It is to be appreciated that the system 10 may
be programmed to manage various messages and calls
received via voice-mails, telephone lines, facsimile/modem,
e-mail and other telecommunication devices
which are connected to the system 10 through the operation
of the audio indexer/prioritizer module 34. In particular,
the audio indexer/prioritizer module 34 may be
programmed to automatically sort and index such messages
and telephone conversations according to their
subject matter and content, origin, or both. The system
10 can preferably be further programmed so as to prioritize
certain calls and messages from a specific individual.

[0060] Referring to Fig. 2, the audio indexing feature
of the system 10 works as follows. Once the caller is
identified and verified by the speaker recognizer module
22, the speaker recognizer module 22 signals the ID
tagger module 30 which automatically tags the identity
of the caller or the identity of the current speaker of a group
of participants to a teleconference. Simultaneously with
the ID tagging process, the transcriber module 32 transcribes
the telephone conversation or message. The
tagging process involves associating the transcribed
message with the identity of the caller or speaker. For
instance, during teleconferences, each segment of the
transcribed conversation corresponding to the current
speaker is tagged with the identity of such speaker
together with the begin time and end time for each such
segment.
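
The tagging of transcript segments with speaker identity and begin and end times might be represented as follows. The `Segment` record, the input tuple layout and the merging of consecutive turns by the same speaker are assumptions of this sketch, not details given in the specification.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # identity supplied by the speaker recognizer
    begin: float   # begin time in seconds from the start of the call
    end: float     # end time in seconds
    text: str      # transcribed words for this segment


def tag_segments(turns):
    """Build tagged segments from (speaker, begin, end, text) tuples in time order.

    Consecutive turns by the same speaker are merged into one segment.
    """
    segments = []
    for speaker, begin, end, text in turns:
        if segments and segments[-1].speaker == speaker:
            last = segments[-1]
            last.end = end
            last.text += " " + text
        else:
            segments.append(Segment(speaker, begin, end, text))
    return segments
```

For a teleconference, the resulting list can be filtered by `speaker` to recover everything one participant said, together with where in the recording it occurred.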

[0061] The information processed in the ID tagger
module 30 and the transcriber module 32 is sent to the
audio indexer/prioritizer module 34, wherein the
received information is processed and stored according
to a pre-programmed procedure. The audio indexer/prioritizer
module 34 can be programmed to index the
messages and conversations in any manner that the
user desires. For instance, the user may be able to
either retrieve the messages from a certain caller,
retrieve all urgent messages, or retrieve the messages
that relate to a specific matter. Further, the audio
indexer/prioritizer module 34 can be programmed to prioritize
calls from a caller who has either left numerous
messages or has left urgent messages.
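
The three retrieval keys (caller, urgency, subject matter) and the prioritization rule of paragraph [0061] can be sketched with an in-memory index. The class name, record fields and the threshold of three messages for "numerous" are assumptions chosen for the example.

```python
from collections import defaultdict


class MessageIndex:
    """Toy index keyed by caller and by topic, with a separate urgent list."""

    def __init__(self):
        self.by_caller = defaultdict(list)
        self.by_topic = defaultdict(list)
        self.urgent = []

    def add(self, caller, topic, text, urgent=False):
        record = {"caller": caller, "topic": topic, "text": text, "urgent": urgent}
        self.by_caller[caller].append(record)
        self.by_topic[topic].append(record)
        if urgent:
            self.urgent.append(record)

    def priority_callers(self, min_messages=3):
        """Callers who left at least min_messages messages or any urgent message."""
        frequent = {caller for caller, records in self.by_caller.items()
                    if len(records) >= min_messages}
        pressing = {record["caller"] for record in self.urgent}
        return sorted(frequent | pressing)
```

Storing the same record under several keys lets each of the retrieval styles in the paragraph be answered with a single lookup.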
[0062] The information stored in the audio indexer/prioritizer
module 36 can then be accessed and retrieved
by the user either locally or remotely. When such information
is accessed by the user, the audio indexer/prioritizer
module 36 sends the requested information to the
speech synthesizer module 38, wherein a text-to-speech
conversion is performed to allow the user to
hear the message in the form of synthesized speech. It
is to be understood that any conventional speech synthesizing
technique may be utilized in the present invention
such as the Eloquent engine provided with the
commercially available IBM VIAVOICE GOLD software.
[0063] It is to be appreciated that information may be
retrieved from the audio indexer/prioritizer module 34
through various methods such as via GUI interface,
PINs and DTMF keying. The preferred method in the
present invention for retrieving such information, however,
is through voice activated commands. Such
method allows the system 10 to identify and verify the
user before providing access to the messages or conversations
stored and indexed in the audio indexer/prioritizer
module 34. The audio indexer/prioritizer module
34 can be programmed to recognize and respond to
certain voice commands of the user, which are processed
by the ASR/NLU module 24 and sent to the audio
indexer/prioritizer module 34, in order to retrieve certain
messages and conversations. For example, the user
may retrieve all the messages from Mr. Smith that are
stored in the audio indexer/prioritizer module 36 through
a voice command, e.g., "Play all messages from Mr.
Smith." This command is received by the server 20 and
sent to the ASR/NLU module 24 for processing. If the
ASR/NLU module 24 understands the query, the
ASR/NLU module 24 sends a reply back to the server
20 to process the query. The server 20 then signals the
indexer/prioritizer module 34 to send the requested
messages to the speech synthesizer to generate synthesized
e-mail or facsimile messages, or directly to the
server 20 for recorded telephone or voice mail messages,
which are simply played back.
[0064] It is to be appreciated that various alternative
programming strategies to process calls may be
employed in the present invention by one of ordinary
skill in the art. For instance, the system 10 may be programmed
to warn the user in the event of an important
or urgent incoming telephone call. Specifically, the system
10 can be programmed to notify the user on a display
thereby allowing the user to make his own decision
on how to handle such call, or to simply process the call,
as demonstrated above, in accordance with a pre-programmed
procedure. Moreover, the system 10 can be
programmed to forward an urgent or important call to
the user's beeper when the user is not home or is out of
the office. The user may also program the system 10 to
dial a sequence of telephone numbers (after answering
an incoming telephone call) at certain locations where
the user may be found during the course of the day. Furthermore,
the sequence (i.e., list) of pre-programmed
telephone numbers may be automatically updated by
the system 10 in accordance with the latest known location
where the user is found. If the user desires, such list
may also be accessible by individuals who call into the system
10 so that such callers can attempt to contact the
user at one of the various locations at their convenience.
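
The sequential dialing and automatic list update described in paragraph [0064] can be sketched as a move-to-front list. The move-to-front policy and the `answers_at` callback are assumptions; the specification says only that the list is updated according to the latest known location of the user.

```python
def find_user(numbers, answers_at):
    """Try each number in order; return (reached_number, updated_list).

    answers_at is a hypothetical callable that dials a number and reports
    whether the user answered there. On success the reached number is moved
    to the front so it is tried first next time.
    """
    for number in numbers:
        if answers_at(number):
            updated = [number] + [n for n in numbers if n != number]
            return number, updated
    return None, list(numbers)
```

For example, if the user is reached at the third number of `["office", "home", "car"]`, the next search starts with `"car"`.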

[0065] In addition, it is to be appreciated that the system
10 may be programmed to store the names of all
persons who call the system 10, together with their telephone
numbers (using ANI), as well as e-mail
addresses of persons who send electronic mail. This
allows the user of the system 10 to automatically reply
to pending calls or messages without having to first
determine the telephone number or e-mail addresses of
the person to whom the user is replying. Further, such
programming provides for dynamically creating a continuously
up-to-date address book which is accessible
to an authorized user to send messages or make calls.
Specifically, the user can access the system 10, select
the name of a particular person to call, and then command
the system 10 to send that person a certain message
(e.g., e-mail or facsimile).
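
A dynamically built address book of the kind described in paragraph [0065] can be sketched as a dictionary that is updated on every incoming call or e-mail. The class and method names are hypothetical, and the caller's name is assumed to have been identified already by other parts of the system.

```python
class AddressBook:
    """Toy address book updated as calls and e-mails arrive."""

    def __init__(self):
        self.entries = {}

    def record_call(self, name, phone):
        # Telephone number as delivered by ANI for an identified caller.
        self.entries.setdefault(name, {})["phone"] = phone

    def record_email(self, name, address):
        self.entries.setdefault(name, {})["email"] = address

    def reply_target(self, name, medium):
        """Return the stored phone or e-mail address for a reply, or None."""
        return self.entries.get(name, {}).get(medium)
```

Later contacts overwrite earlier ones for the same person and medium, which keeps the book continuously up to date without user intervention.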
[0066] Furthermore, the system 10 may be programmed
to allow the callers to access and utilize specific
functions of the system 10. For instance, the
system 10 may offer the caller the option to schedule a
tentative appointment with the user, which may then be
stored in the system 10 and then subsequently
accepted or rejected by the user. The caller may also be
afforded the opportunity to choose the method by which
the user may confirm, reject or adjourn such appointment
(e.g., telephone call, facsimile or e-mail). Additionally,
the system 10 may be programmed to provide
certain authorized callers with access to the user's
appointment calendar so that such appointments may
be easily scheduled.

[0067] It is to be further appreciated that the present
invention may be employed in a small scale application
for personal home use, or employed in large scale
office or corporate applications. It is to be further appreciated
by one of ordinary skill in the art that the system
10 may be utilized in other applications. For instance, by
utilizing the NLU feature of the system 10, the system
10 may be connected to devices such as tape recorders,
radios and televisions so as to warn the user whenever
a certain topic is being covered on some channel
or if a particular person is being interviewed. It is to be
understood that the system 10 is not limited to telephone
communications. It is possible to use the system
10 for web phones, net conversations, teleconferences
and other various voice communications which involve
the transmission of voice through a digital or analog
channel. Additional electronic information such as
ASCII characters, facsimile messages and the content
of web pages and database searches can also be processed
in the same manner. For example, by adding optical
character recognition (OCR) with facsimile receiving
capabilities, the system 10 is able to transcribe the content
of messages received by facsimile or e-mail to be
stored in the audio indexer/prioritizer 34. As demonstrated
above, the user may then retrieve these messages
through the speech synthesizer 36 to hear the
content of such messages.

[0068] In sum, the present invention provides a programmable
call and message processing system which
can be programmed by a user to process incoming telephone
calls, e-mail messages, facsimile messages
and other electronic information data in a predetermined
manner without the user having to first manually
answer a telephone call or retrieve an e-mail or facsimile
message, identify the caller or the author of the message,
and then decide how to transfer such call or
respond to such message. The present invention can be
programmed to transcribe telephone conversations or
teleconferences, tag the identity of the caller or participants
to the teleconference, and store such messages
and conversations according to the identity of the caller
or author and/or the subject matter and content of the
call or message. The user may then retrieve any stored
message or conversation based on the identity of the
caller or a group of related messages based on their
subject matter.
Further features of the invention may be as follows: 
[0069] The server means further receives, and is
responsive to, one of an incoming facsimile message,
e-mail message, voice data, data convertible to text and a
combination thereof.
[0070] The speaker recognition means is based on
text-independent speaker recognition.
[0071] The speech recognition means utilizes speech
recognition and natural language understanding to
determine said subject matter and content of said call.
[0072] The system includes language identification
means, operatively coupled to said speech recognition
means, for identifying and understanding languages of
said incoming call.

[0073] The identification means performs language
translation.

[0074] The identity of said caller is determined from
said identified language of said call.
[0075] The language identification means uses identity
of said caller to identify language of said call.
[0076] Enrollment means are included for enrolling a
new user to have access to said system.
[0077] The new user may be self-enrolled.



[0078] Means are provided for determining a time of
said call and wherein said system may be further programmed
to process said call in accordance with said
time of said call.

[0079] The programming means include one of a
GUI interface, a voice interface, a programming configuration
file, and a combination thereof.
[0080] The programming may be performed one of
locally, remotely and a combination thereof.
[0081] Means are provided, responsive to said incoming
call, for dynamically creating an address book.
[0082] Means are provided for accessing said address
book to send a message to a selected person.
[0083] Processing of said call includes transferring an 
incoming telephone call to a plurality of different tele- 
phone numbers one of sequentially and simultaneously. 
[0084] Means are provided for prompting the caller to
identify him/herself and the subject matter of said call.
Said prompting is performed when said system cannot
determine either said identity or said subject matter of
call.

[0085] Alternately said prompting is performed when
said call is received to determine said identity of said
caller and subject matter of said call.
[0086] May further comprise means, operatively connected
to said transcribing means, for dictating messages
from a user of said system and sending said
message to a selected person. The message may be
sent by one of a facsimile, e-mail or telephone call, and
a combination thereof, to said selected person.
[0087] May further comprise means for adding mood
stamps or urgency/confidentiality stamps in a header in
one of said facsimile and e-mail.
[0088] The step of determining said identity of said
caller may be performed by text-independent speaker
recognition.

[0089] The step of determining said subject matter of
said call may be performed by speech recognition and
natural language understanding.
[0090] The method may include the step of translating
said call into a language other than that of said call.
[0091] The incoming call may be recorded.
[0092] Recording is performed simultaneously with
said step of determining identity of said caller and may
be performed prior to said step of determining identity of
said caller.

[0093] May further comprise the steps of: determining
a time of said call; and processing said call based on
said determined time of said call.
[0094] The step of retrieving said indexed information
is performed by voice commands.
[0095] The method may include determining the time
of one of said call and message; and processing one of
said call and message in accordance with said determined
time.
Claims

1. An automatic call and data transfer processing system,
comprising: server means (20) for receiving an
incoming call; characterised by

speaker recognition means (22), operatively
coupled to said server means, for identifying
caller of said call;
speech recognition means (24), operatively
coupled to said server means, for determining
subject matter and content of said call;
switching means (28), responsive to said
speaker recognition means and speech recognition
means, for processing said call in accordance
with one of said identification of said
caller and determined subject matter; and
programming means (38), operatively coupled
to said server means, said speaker recognition
means, said speech recognition means and
said switching means for programming system
to perform said processing.

2. A system of claim 1, characterised in that the server
means includes means for recording (40) said
incoming call.

3. A system of claim 2, characterised in that said
server means further includes means (42) for
compressing and storing said recorded data and means
for decompressing said compressed data.

4. A system of claim 1, 2 or 3 further characterised by
identification tagging means (30), responsive to
said speaker recognition means, for automatically
tagging said identity of said caller; transcribing
means (32), responsive to said speech recognition
means, for transcribing a telephone conversation or
message of said caller; and audio indexing means
(34), operatively coupled to said identification
tagging means and said transcribing means, for
indexing said messages and said conversations of said
caller according to subject matter of said
conversation and said message and the identity of said
caller.

5. A system of claim 4 further characterised by means
for retrieving (118) said indexed messages from
said audio indexing means.

6. A system of claim 2, 4 or 5, further characterised by
speech synthesizer means (36) operatively coupled
to said server means, said speech recognition
means and said audio indexing means, for
converting information stored in said audio indexing means
into synthesized speech.

7. A method for providing automatic call or message
data processing, characterised by determining the
identity of said caller (130) from an incoming call;
determining the subject matter of said call (170);
processing (152, 154, 156, 158) said call in accordance
with one of said identity of said caller and
subject matter of said call.

8. A method for providing automatic call or message
data processing, comprising the steps of: receiving
one of an incoming call and message data (100);
identifying a caller of said call if an incoming call is
received (130) and determining subject matter of
said call (160); identifying an author of said
message if message data is received and determining
subject matter of said message; processing (152,
154, 156, 158) one of said call and message in
accordance with one of said identity of said caller
and author and said subject matter of said call and
message.

9. The method further characterised by the steps of:
tagging said determined identity of one of said
caller and said author; transcribing said determined
subject matter of one of said call and said message;
indexing the information resulting from said tagging
and said transcribing in accordance with one of said
determined subject matter, said determined identity
and a combination thereof.

10. A method of claim 9 characterised by retrieving
said indexed information and converting said
indexed information into synthesized speech.
(54) System and methods for automatic call and data transfer processing



(57) A programmable automatic call and data transfer
processing system which automatically processes
incoming telephone calls, facsimiles and e-mails based
on the identity of the caller or author, the subject matter
of the message or request, and/or the time of day, which
includes: a central server for automatically answering
an incoming call and collecting voice data of a caller; a
speaker recognition module connected to the server for
identifying the caller or author; a switching module
responsive to the speaker recognition module for
processing the call or message in accordance with a
pre-programmed procedure based on the identification of the
caller or author; and a programming interface for
programming the server, speaker recognizer module and
the switching module. The system is programmed by the
user so as to process incoming telephone calls or
e-mail and facsimile messages based on the identity of
the caller or author, subject matter and content of the
message and the time of day. Such processing includes,
but is not limited to, switching the call to another system,
forwarding the call to another telephone terminal,
placing the call on hold, or disconnecting the call. In another
aspect of the present invention, the system may be
employed to process information retrieved from other
telecommunication devices such as voice mail,
facsimile/modem or e-mail. The system is capable of tagging the
identity of a caller or participants to a teleconference,
and transcribing the teleconferences, phone
conversations and messages of such callers and participants.
The system can automatically index or prioritize the
received calls, messages, e-mails and facsimiles
according to the caller identification or subject matter of the
conversation or message, and allow the user to retrieve
messages that either originated from a specific source
or caller or retrieve calls which deal with similar or
specific subject matter.